allow mixture components to override cache_dir #754

Merged
dlwh merged 4 commits into main from override_cache
Oct 5, 2024

Conversation

dlwh (Member) commented Oct 5, 2024

No description provided.

@@ -788,10 +796,23 @@ def build_caches(
     if weight == 0 and split == "train":
         continue

-    source_config_dict = source_config.__dict__
+    source_config_dict = dict(**source_config.__dict__)

Please add a comment explaining why we need to delete it. Is it to make the LMDatasetMixtureComponentConfig act like an LMDatasetSourceConfig?

Member Author

More or less. I also cleaned this part up a bit.
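
For readers following the thread, here is a minimal sketch of why the copy matters. `source_config.__dict__` is the instance's live attribute dictionary, so deleting a key from it would modify the config object itself, whereas `dict(**source_config.__dict__)` produces a shallow copy that can be edited safely. `SourceConfig` and `fields_without_cache_dir` are hypothetical stand-ins for illustration, not the project's real classes or helpers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceConfig:
    """Hypothetical stand-in for a dataset source config."""

    id: Optional[str] = None
    cache_dir: Optional[str] = None


def fields_without_cache_dir(config: SourceConfig) -> dict:
    # Shallow-copy the attribute dict so the config object is left untouched.
    fields = dict(**config.__dict__)
    # Drop the key on the copy only; the original instance keeps its value.
    fields.pop("cache_dir", None)
    return fields


config = SourceConfig(id="my-dataset", cache_dir="cache/my-dataset")
stripped = fields_without_cache_dir(config)
```

After this runs, `stripped` no longer has a `cache_dir` key, while `config.cache_dir` still holds its original value.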

@@ -524,13 +524,18 @@ def fsspec_expand_glob(url):
return urls


@dataclass
class LMDatasetMixtureComponentConfig(LMDatasetSourceConfig):

Is cache_dir the only difference between the base and derived class? That seems odd, since on the surface it doesn't relate to mixtures...

Member Author

Yeah, I can just put it in the base class, I guess.
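
One way the outcome of this thread could look, sketched under assumptions: an optional `cache_dir` lives on the base source config, and each mixture component either supplies its own value or falls back to a location derived from the mixture-level cache. The `SourceConfig` class, the `resolve_cache_dir` helper, and the fallback rule (a subdirectory named after the component) are all hypothetical and are not taken from the merged code.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceConfig:
    """Hypothetical base source config with an optional per-component cache_dir."""

    cache_dir: Optional[str] = None


def resolve_cache_dir(mixture_cache_dir: str, name: str, source: SourceConfig) -> str:
    # Assumed rule: an explicit per-component cache_dir wins.
    if source.cache_dir is not None:
        return source.cache_dir
    # Otherwise fall back to a subdirectory of the mixture-level cache.
    return os.path.join(mixture_cache_dir, name)


default_dir = resolve_cache_dir("cache", "wiki", SourceConfig())
override_dir = resolve_cache_dir("cache", "wiki", SourceConfig(cache_dir="/data/wiki"))
```

Keeping the field on the base class avoids a subclass whose only purpose is to add one attribute, which is the concern raised in the review.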

@dlwh dlwh merged commit 3bae9d3 into main Oct 5, 2024
8 checks passed
@dlwh dlwh deleted the override_cache branch October 5, 2024 04:20

2 participants